ATRECSS: ATR English Speech Corpus for Speech Synthesis

Authors

Jinfu Ni

Toshio Hirai

Hisashi Kawai

Tomoki Toda

Keiichi Tokuda

Minoru Tsuzaki

Shinsuke Sakai

Ranniery Maia

Satoshi Nakamura

Abstract

This paper introduces a large-scale, phonetically balanced English speech corpus developed at ATR for corpus-based speech synthesis. The corpus comprises 16 hours of American English speech read by a professional male narrator in “reading style.” The prompt sentences are drawn mainly from news articles, travel conversations, and novels. They were selected from large text collections using a greedy algorithm to maximize the coverage of linguistic units such as diphones and triphones. Several measures were taken during recording to control undesirable variation in voice quality, both in the short term (daily) and in the long term (monthly). Statistics for the full corpus, as well as for the subsets provided for the Blizzard Challenge 2006 and 2007, are presented.
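The greedy selection mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the scoring function, unit weighting, or stopping rule, so the code below assumes a plain unweighted coverage gain over diphones only; the function names (`diphones`, `greedy_select`), the sentence IDs, and the ARPAbet-style phone labels are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Set, Tuple

Diphone = Tuple[str, str]


def diphones(phones: List[str]) -> Set[Diphone]:
    """Return the set of adjacent phone pairs found in one sentence."""
    return set(zip(phones, phones[1:]))


def greedy_select(candidates: Dict[str, List[str]], max_sentences: int) -> List[str]:
    """Repeatedly pick the sentence that adds the most not-yet-covered diphones.

    Assumption: every diphone counts equally. A real corpus design may weight
    units by frequency or also count triphones.
    """
    units = {sid: diphones(phones) for sid, phones in candidates.items()}
    covered: Set[Diphone] = set()
    selected: List[str] = []

    while len(selected) < max_sentences:
        best_id, best_gain = None, 0
        for sid in sorted(units):  # sorted IDs give deterministic tie-breaking
            if sid in selected:
                continue
            gain = len(units[sid] - covered)
            if gain > best_gain:
                best_id, best_gain = sid, gain
        if best_id is None:  # no remaining sentence adds a new unit
            break
        selected.append(best_id)
        covered |= units[best_id]

    return selected


# Toy candidate pool (hypothetical data, for illustration only).
candidates = {
    "s1": ["HH", "AH", "L", "OW"],
    "s2": ["AH", "L", "OW"],
    "s3": ["K", "AE", "T"],
}
print(greedy_select(candidates, max_sentences=2))
```

In this toy pool, `s2` is never chosen because all of its diphones are already covered once `s1` is selected, which is the behavior a coverage-driven selection is meant to show.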

Related resources

Toward a Broad-coverage Bilingual Corpus for Speech Translation of Travel Conversations in the Real World

At ATR Spoken Language Translation Research Laboratories, we are building a broad-coverage bilingual corpus to study corpus-based speech translation technologies for the real world. There are three important points to consider in designing and constructing a corpus for future speech translation research. The first is to have a variety of speech samples, with a wide range of pronunciati...

Overview of Speech Translation at ATR

A speech translation system will transform a spoken dialogue from the speaker's language to the listener’s automatically and simultaneously. It will undoubtedly be used to overcome language barriers and facilitate communication among the peoples of the world. Creation of such a system will first require developing the various constituent technologies: speech recognition, machine translation, an...

NICT-ATR Speech-to-Speech Translation System

This paper describes the latest version of speech-to-speech translation systems developed by the team of NICT-ATR for over twenty years. The system is now ready to be deployed for the travel domain. A new noise-suppression technique notably improves speech recognition performance. Corpus-based approaches of recognition, translation, and synthesis enable coverage of a wide variety of topics and ...

Speech Technology and Corpus Development in Thailand

This paper describes some recent activities on speech technology and corpus development in Thailand. Many speech corpus projects have been launched this year. The National Electronics and Computer Technology Center (NECTEC) recently provided a grant for two cooperative speech corpus projects to interested universities. The first project aims at developing a Thai speech corpus for the research o...

XIMERA: a new TTS from ATR based on corpus-based technologies

This paper describes a new concatenative TTS system under development at ATR. The system, named XIMERA, is based on corpus-based technologies, as was the case for the preceding TTS systems from ATR, namely ν-talk and CHATR. The prominent features of XIMERA are (1) large corpora (a 110-hour corpus of a Japanese male, a 60-hour corpus of a Japanese female, and a 20-hour corpus of a Chinese fema...

Publication date: 2007
